fi 


AD-A285  209  CUMENTAT10N  PAGE 


Form  Appr ova*  ' 
CM*  rVa.  370*4 1 8* 


1.  AGENCY  USE  UNtT  («.««*•  0<an«>  REPORT  OATI 

6/20/94 


.•non  •»  «««••«••  <0  **«*•*•  1  *ow  s*f  'Wmkm.  w«  tim#  'Of  '"0  '"WWicbow.  icortnino  (mono  o»«  'owrtt*. 

moitciiMi  m  f0*'O**^O  w  couocnon  of  imwiimioi  Sooo  coootnn  'f oifotoo  Wo  ®if00o  moitt  of  no  roof  moor  of  Wf 
.  foouano  tm«  ouroOT.  :o  <v  mm  not  on  ■onoimin  tofwcot.  Otroeiofoto  tof  mfonomon  Ooorooow  no  Moom.  mi  joftonoff 
02.  *oo  toinoOff^oo*  wjfMffffi  »**o  tuooof.  'lamwi  ffooction  motor  10 704-01  Ml.  WMMnqrox.  0C  19101. 

“  |  j.  report  oate  w  j.  report  type  ano  oates  covered 

6/20/94  Final  4/1/93  -  3/31/94  _ 


X  TITLE  ANO  SUBTITLE 

Numerical  Methods  for  Singularly  Perturbed  Differential 
Equations  with  Applications 

X  FUNOING  NUMBERS  | 

F49620-93-1-0218 

^XAutaORtS) 

Joseph  E.  Flaherty 

7.  PERFORMING  ORGANIZATION  NAME(S)  ANO  AOORESS(E 


Rensselaer  Polytechnic  Institute 
110  Eighth  Street 
Troy,  New  York  12180 

7.  SPONSORING/ mSmTONNG  AGENCY  NAME(S)  ANO  AOORESS(ES) 

Air  Force  Office  of  Scientific  Research 

Computational  Mathematics 

Bolling  Air  Force  Base,  DC  20332-6448 

11.  SUmEMENTARY  NOTES 


12*.  OtSTRHUTION  /  AVAILABILITY  STATEMENT 

unlimited 


REPORT  NUMBER 


9  4  0  5V  3 


10.  SPONSORING/  MONITORING 
AGENCY  REPORT  NUMBER 


IX  ABSTRACT  (Maximum  iOO  worUU 

During  this  one-year  project,  we  continued  our  research  on  the  development,  analysis, 
and  application  of  serial  and  parallel  adaptive  computational  strategies  for  solving  tran¬ 
sient  and  steady  partial  differential  systems.  We  concentrated  on  high-order  methods 
and  adaptive  approaches  that  unite  mesh  refinement  and  coarsening  (h-refinement), 
order  variation  (p-refinement),  and  mesh  motion  (r-refinement).  Parallel  computational 
techniques  involved  load-balancing  and  load-redistribution  strategies  for  implementing 
these  adaptive  methods  on  distributed-memory  MIMD  computers.  In  particular,  we 
have  developed  migration  strategies  that  exchange  finite  elements  between  neighboring 
spatial  domains  of  different  processors.  Effective  load  balancing  in  an  adaptive  setting 
requires  speedy  procedures  since  balancing  must  be  performed  frequently.  Migration 
offers  several  advantages  in  this  regard  since  it  (i)  has  a  low  unit  cost,  (ii)  can  take 
advantage  of  locality,  and  (iii)  can  improve  communications  volumes.  Procedures 
tested  in  two  dimensional  situations  are  being  extended  to  three  dimensions  and  prel- 


14.  SUBJECT 

adaptive  methods,  singularly  perturbed  equations,  partial 
differential  equations,  parallel  computation 

17.  SECURITY  CLASSIFICATION  IX  SECURITY  CLASSIFICATION  |  It.  OASSUHCATIo'n’ 

3E  REPORT  OP  THIS  PAGE  I  OP  ABSTRACT 

unclassified  unclassified  1  unclassified 


IX  SECURITY  CLASSIFICATION 
OF  THIS  FAGS 

unclassified 


IX  NUMBER  OP  PAGES 

20 


IX  PRICE  COOE 


I IX  LIMITATION  OF  ABSTRACT 


NSN 


5JO-01-;  30-5500 


Standard  Form  /9*  (Rr*  /  AS) 

>~vron  a*  anv  ita.  23*. '• 


GENERAL  INSTRUCTIONS  FOR  COMPLETING  SF  298 


The  Report  Documentation  Page  (ROP)  is  used  in  announcing  and  cataloging  reports,  it  is  important 
that  this  information  be  consistent  with  the  rest  of  the  report,  particularly  the  cover  and  title  page. 
Instructions  for  filling  in  each  block  of  the  form  follow,  it  is  important  to  stay  within  the  linn  to  meet 
optical  scanning  requirements. 


Block  1.  Aaencv  Use  Only  (Leave  blank). 

Block  2.  Report  Date.  Pull  publication  date 
including  day.  month,  and  year,  if  available  (e.g.  1 
Jan  88).  Must  cite  at  least  the  year. 

Block  3.  Type  of  Report  and  Oates  Covered. 
State  whether  report  is  interim,  final,  etc  If 
applicable,  enter  inclusive  report  dates  (e.g.  1 0 
Jun87- 30  Jun  88). 

Block  4.  Title  and  Subtitle.  A  title  is  taken  from 
the  part  of  the  report  that  provides  the  most 
meaningful  and  complete  information.  When  a 
report  is  prepared  in  more  than  one  volume, 
repeat  the  primary  title,  add  volume  number,  and 
include  subtitle  for  the  specific  volume.  On 
classified  documents  enter  the  title  classification 
in  parentheses. 

Blocks.  Funding  Numbers.  To  include  contract 
and  grant  numbers:  may  include  program 
element  numbers),  project  numbers),  task 
numbers),  and  work  unit  numbers).  Use  the 
following  labels: 


Block  12a.  Oistnbution/Availabilitv  Statement;. 
Denotes  public  availability  or  limitations.  Cte  any 
availability  to  the  public  Enter  additional 
limitations  or  special  markings  in  ail  capitals  (e.g. 
NOFORN,  REL,  ITAR). 

000  -  See  OoOO  5230. 24,  'Distribution 
Statements  on  Technical 
Documents.* 

DOE  -  See  authorities. 

NASA-  See  Handbook  NH8  2200.2. 

NTIS  -  Leave  blank. 


Block  12b.  Distribution  Code. 

000  -  Leave  blank. 

DOE  *  Enter  OOE  distribution  categories 
from  the  Standard  Distribution  for 
Unclassified  Scientific  and  Technical 
Reports. 

NASA  •  Leave  blank. 

NTIS  -  Leave  blank. 


C  -  Contract 

PR  -  Project 

G  •  Grant 

TA  -  Task 

PE  -  Program 

WU  -  Work  Unit 

Element 

Accession  No. 

Block  13.  Abstract.  Include  a  bnef  (Maximum 
200  words)  factual  summary  of  the  most 
significant  information  contained  in  the  report. 


Block  6.  Authorfs).  Name<s)  of  person(s) 
responsible  for  writing  the  report,  performing 
the  research,  or  credited  with  the  content  of  the 
report.  If  editor  or  compiler,  this  should  follow 
the  name<s). 

Block  7.  Performing  Organization  Name(s)  and 
Addressees).  Self-explanatory. 

Block  8.  Performing  Organization  Report 
Number.  Enter  the  unique  alphanumeric  report 
numbers)  assigned  by  the  organization 
performing  the  report. 

Btock9.  Soonsoring/Monitohno  Aoencv  Nameis) 
and  Addressees).  Self-explanatory. 

Block  10.  Soonsorino/Monitorino  Aoencv 
Report  Number.  (If  known) 


Block  14.  Subject  Terms.  Keywords  or  phrases 
identifying  major  subjects  in  the  report. 

Block  15.  Number  of  Pages.  Enter  the  total 
number  of  pages. 

Block  16.  Price  Code.  Enter  appropriate  price 
code  (NTIS  only). 

Blocks  17. -19.  Security  Oassifications.  Self- 
explanatory.  Enter  U.S.  Secunty  Classification  in 
accordance  with  U.S.  Security  Regulations  (i.e., 
UNCLASSIFIED).  If  form  contains  classified 
i  nformation,  stamp  classification  on  the  top  and 
bottom  of  the  page. 


Block  11.  Supplementary  Notes.  Enter 
information  not  included  elsewhere  such  as: 
Prepared  in  cooperation  with...;  Trans,  of...;  To  be 
published  in —  When  a  report  is  revised,  include 
a  statement  whether  the  new  report  supersedes 
or  supplements  the  older  report. 


Block  20.  Limitation  of  Abstract.  This  block  must 
be  completed  to  assign  a  limitation  to  the 
abstract.  Enter  either  UL  (unlimited)  or  SAR  (same 
as  report).  An  entry  in  this  block  is  necessary  if 
the  abstract  is  to  be  limited.  If  blank,  the  abstract 
is  assumed  to  be  unlimited. 


Standard  Form  294  Sack  (Rev.  2- 09) 


THIS  DOCUMENT  IS  BEST 
QUALITY  AVAILABLE.  THE  COPY 
FURNISHED  TO  DTIC  CONTAINED 
A  SIGNIFICANT  NUMBER  OF 
COLOR  PAGES  'WHICH  DO  NOT 
REPRODUCE  LEGIBLY  ON  BLACK 
AND  WHITE  MICROFICHE. 


FINAL  SCIENTIFIC  REPORT 


Air  Force  Office  of  Scientific  Research  Grant  F49620-93-1-0218 


Period:  1  April  93  through  31  March  1994 

Title  of  Research:  Numerical  Methods  for  Singularly 

Perturbed  Differential  Equations 
with  Applications 

Principal  Investigator:  Joseph  E.  Flaherty 


Department  of  Computer  Science 
Rensselaer  Polytechnic  Institute 


V 


Troy,  New  York  12180 


-  2  - 

ABSTRACT 


During  this  one-year  project,  we  continued  our  research  on  the  development,  analysis, 
and  application  of  serial  and  parallel  adaptive  computational  strategies  for  solving  transient 
and  steady  partial  differential  systems.  We  concentrated  on  high-order  methods  and  adap¬ 
tive  approaches  that  unite  mesh  refinement  and  coarsening  (h-refinement),  order  variation 
(p-refinement),  and  mesh  motion  (r-refinement).  Parallel  computational  techniques 
involved  load-balancing  and  load-redistribution  strategies  for  implementing  these  adaptive 
methods  on  distributed-memory  MIMD  computers.  In  particular,  we  have  developed 
migration  strategies  that  exchange  finite  elements  between  neighboring  spatial  domains  of 
different  processors.  Effective  load  balancing  in  an  adaptive  setting  requires  speedy  pro¬ 
cedures  since  balancing  must  be  performed  frequently.  Migration  offers  several  advan¬ 
tages  in  this  regard  since  it  (i)  has  a  low  unit  cost,  (ii)  can  take  advantage  of  locality,  and 
(iii)  can  improve  communications  volumes.  Procedures  tested  in  two  dimensional  situa¬ 
tions  are  being  extended  to  three  dimensions  and  preliminary  performance  results  are 
encouraging. 


We  have  initiated  an  effort  to  use  our  adaptive  software  to  address  problems  associ¬ 
ated  with  the  manufacture  of  ceramic  composites.  During  the  year,  we  focused  on  model¬ 
ing  and  solving  problems  involving  “reactive  vapor  infiltration  (RVI).’’  With  RVI,  the 
space  surrounding  a  fibrous  preform  is  filled  with  a  powder  that  is  infiltrated  with  a  gas  to 
react  and  form  the  matrix.  The  RVI  process  is  faster  than  competing  techniques  (e.g., 
CVD,  CVI)  and  offers  the  promise  of  reduced  fabrication  costs.  Modeling,  thus  far,  has 
involved  mass  and  momentum  considerations  for  a  porous  continuum.  Volume  expansion, 
as  a  result  of  chemical  reaction,  is  included  in  the  model  to  account  for  pores  between 
powder  grains  closing,  global  volume  expansion  or  contraction,  and  stress  concentration  or 
cracking.  Results  compare  favorably  with  experiment  and  our  computational  investigation 
is  motivating  improved  fabrication  techniques. 
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1.  Research  Status  and  Key  Results 

During  the  past  year  we  continued  our  investigations  of  adaptive  techniques  for  (i) 
transient  parabolic  and  hyperbolic  partial  differential  systems  and  (ii)  massively  parallel 
computer  systems.  We  also  initiated  a  new  effort  to  apply  these  computational  strategies 
to  problems  arising  in  composite  material  fabrication.  Highlights  of  our  recent  accom¬ 
plishments  include: 

i.  Developed  the  first  adaptive  hp-refinement  technique  for  parabolic  partial  differential 
systems  [3,  7,  13].1 

ii.  Developed  efficient  a  posteriori  error  estimation  techniques  for  parabolic  [2,  3]  and 
hyperbolic  [6]  systems. 

iii.  Developed  load  balancing  procedures  for  adaptive  computation  on  distributed- 
memory  parallel  computers  [4-6,  8]. 

iv.  Developed  high-order  methods  for  hyperbolic  [5,  6]  and  singularly-perturbed  para¬ 
bolic  systems  [10]. 

v.  Developed  mathematical  models  for  ceramic  composite  fabrication  by  a  reactive 
vapor  infiltration  process  [9]. 

vi.  Applied  adaptive  techniques  to  compressible  flow  [11]  and  materials  fabrication  [9] 
problems. 

Detailed  descriptions  of  these  endeavors.  Lists  of  external  interactions  and  publica¬ 
tions  appear  in  Sections  2  and  3,  respectively. 

1.1.  hp-Refinement  and  Error  Estimation 

Adaptive  method  of  lines  (MOL)  procedures  utilize  the  power  of  sophisticated  ordi¬ 
nary  differential  equations  software  and  avoid  developing  special  techniques  for  time 
integration.  Using  backward  difference  software  and  our  variant  of  the  singly  implicit 
Runge-Kutta  (SIRK)  method  [3,  13],  we  developed  the  first  adaptive  hp-refinement  finite 
element  strategy  for  parabolic  systems  [3,  7,  13].  Estimates  of  spatial  discretization  errors, 
obtained  using  p-refinement  to  create  and  solve  local  problems  having  forcing  proportional 
to  elemental  or  edge  residuals  of  solutions,  are  efficient  and  converge  to  the  actual  finite 
element  spatial  errors  at  the  correct  rate  [2].2  The  temporal  variation  of  spatial  errors  may 
be  neglected;  hence,  error  estimates  may  be  determined  by  solving  local  elliptic  instead  of 

1  Numbers  in  brackets  correspond  to  publications  and  manuscripts  in  Section  3. 

2  A  two-dimensional  analysis  is  being  described  in  a  manuscript  under  preparation  “A  Posteriori  Error 
Estimation  for  the  Finite  Element  Method  of  Lines  Solution  of  Parabolic  Systems,”  by  S.  Adjerid,  I.  Babus- 
ka,  and  J£.  Flaherty. 
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local  parabolic  problems.  In  typical  applications,  this  reduces  the  time  required  to  obtain 
an  error  estimate  by  approximately  thirty  percent. 

With  a  goal  of  developing  adaptive  local  refinement  methods  (LRMs),  where  local  h- 
and  p-refinement  occurs  in  both  space  and  time,  we  developed  a  variant  of  the  SIRK 
method  that  offers  several  advantages  when  used  with  hp-refinement  [3,  13].  SIRKs  have 
high-stage  order.  A-  and  L-stability,  efficiencies  close  to  those  of  backward  difference 
software,  embedded  error  estimates,  and  a  locality  that  is  suitable  for  parallel  computation. 
Temporal  error  estimates  of  each  SIRK  stage  may  be  computed  for  the  cost  of  one  addi¬ 
tional  function  evaluation  [3];  thus,  results  at  partial  time  steps  may  be  accepted  [13]. 

Spatial  error  estimates  obtained  by  p-refinement  [2,  7]  have  a  limited  range  of  appli¬ 
cability  when  applied  to  singularly-perturbed  systems.  Furthermore,  solutions  obtained  by 
Galerkin  techniques  produce  spurious  oscillations  when  the  computational  mesh  is  too 
coarse  within  boundary  and  interior  layers.  Thus,  without  an  a  priori  knowledge  of  layer 
locations,  an  adaptive  system  would  produce  a  preliminary  coarse-mesh  solution  contain¬ 
ing  spurious  large-amplitude  oscillations.  Even  with  restricted  range,  an  error  estimator 
would  (correctly)  indicate  enrichment  in  oscillatory  regions.  A  correct  solution  could  pos¬ 
sibly  be  obtained  through  this  enrichment;  however,  it  would  hardly  be  optimal  since  an 
excessively  fine  mesh  would  be  needed  to  eliminate  anomalous  oscillations.  Traditional 
approaches  for  overcoming  this  difficulty  add  artificial  viscosity  to  the  solution,  thus, 
widening  boundary  layers  to  the  current  mesh  width.  These  techniques  are  mostly  suc¬ 
cessful  with  low-order  methods  for  convection-diffusion  systems;  hence,  they  would 
require  substantial  modification  to  a  high-order  adaptive  software  system. 

Using  singular-perturbation  theory,  Adjerid  et  al.  [10]  developed  a  strategy  for  stabil¬ 
izing  solutions  of  singular-perturbation  problems  by  changing  the  quadrature  rule  used  to 
evaluate  inner  products.  Among  other  things,  they  show  that  Radau  and  Lobatto  quadra¬ 
ture,  respectively,  produce  stable  solutions  of  convection-  and  reaction-diffusion  problems. 
These  minor  modifications  to  an  adaptive  software  system  confine  large  errors  to  layers 
regardless  of  whether  or  not  their  locations  are  known.  The  framework,  furthermore,  indi¬ 
cates  how  to  develop  quadrature  rules  for  other  singular  perturbations. 

Biswas  et  al.  [6]  and  Devine  et  al.  [5]  developed  superior  solut’on  limiting  and  error 
estimation  schemes  for  a  local  finite  element  technique  of  Cockbum  and  Shu.3  When  used 
to  solve  hyperbolic  systems  of  conservation  laws,  the  new  limiting  strategies  preserve  an 
essentially  non-oscillatory  behavior  near  discontinuities  while  maintaining  a  high  order  of 

3  B.  Cockbum  and  C.-W.  Shu,  TVB  Runge-Kutta  Local  Projection  Discontinuous  Finite  Element  Method 
for  Conservation  Laws  II:  General  Framework,  Maths.  Comp.,  52  (1989),  411-435. 

B.  Cockbum  and  C.-W.  Shu,  TVB  Runge-Kutta  Local  Projection  Discontinuous  Finite  Element  Method  for 
Conservation  Laws  III:  One-Dimensional  Systems,  J.  Comput.  Phys.,  84  (1989),  90-113. 
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accuracy  in  smooth  regions.  Following  the  ideas  of  Adjerid  et  al.  [10],  spatial  error  esti¬ 
mates  utilize  a  p-refinement  approach  with  supeiconvergence  at  Radau  points  to  compute 
efficient  and  (apparently)  asymptotically  correct  results  [3,  6]. 

1.2.  Parallel  Procedures 

Our  recent  research  on  load  balancing  schemes  for  adaptive  methods  is  intended  for 
use  with  distributed-memory  MIMD  computation.  An  effective  load  balancing  strategy 
should  minimize  the  total  execution  time  including  the  computational  time,  the  communi¬ 
cations  time,  and  the  time  spent  on  reassigning  processors  to  different  spatial  regions  as  a 
result  of  adaptivity.  Thus,  existing  strategies  that  minimize  communication  on  a  fixed 
mesh  will  be  sub-optimal  because  of  their  high  per-step  expense.  Processor  scheduling 
and  load  balancing  must  be  dynamic  and  have  a  low  unit  cost  and  high  parallel  perfor¬ 
mance.  Thus,  we  utilize  incremental  migration  techniques  where  finite  elements  are  inter¬ 
changed  between  processors  as  the  solution  evolves  and  processor  demands  change. 

Tiling  [5,  6]  is  a  modification  of  a  global  load  balancing  technique  of  Leiss  and 
Reddy4  who  balanced  local  loading  within  overlapping  processor  neighborhoods.  How¬ 
ever,  rather,  than  using  hardware  interconnections,  tiling  neighborhoods  are  defined  as  a 
processor  at  the  center  of  a  circle  of  some  predefined  radius,  all  other  processors  within 
this  circle,  and  all  processors  having  finite  elements  that  are  neighbors  of  elements  in  the 
central  processor.  Each  processor  is  the  center  of  a  neighborhood  and  loading  is  balanced 
within  neighborhoods  using  local  performance  measurements.  Work  is  migrated  from  a 
processor  to  any  other  processor  within  the  same  neighborhood.  A  priority  system  favors 
migration  of  elements  having  neighbors  in  the  importing  processor.  We  illustrate  some 
performance  characteristics  through  an  example. 

Example  1.  We  solved  a  linear  advection  equation  on  a  two-dimensional  rectangular 
domain  on  a  1024-processor  nCUBIi/2  hypercube  at  Sandia  National  Laboratories  using 
the  local  finite  element  method  [5,  6]  with  fixed-order  (two)  and  adaptive  p-refinement. 
The  mesh  for  the  fixed-order  computation  was  selected  so  that  its  solution  and  the  adap¬ 
tive  solutions  had  approximately  the  same  error,  however,  the  resulting  error  of  the  fixed- 
order  solution  was  66%  greater  than  that  of  the  adaptive  solutions  (cf.  Table  1).  The  solu¬ 
tion  of  the  advection  equation  involved  a  relatively  steep  wave  front  that  propagated 
across  the  domain  at  an  oblique  angle;  thus,  processor  load  balancing  was  necessary 
throughout  the  course  of  the  computation. 

Performance  statistics,  presented  in  Table  1,  indicate  that  adaptivity  reduced  the  total 
execution  of  the  fixed-order  method  by  68%  with  a  23%  loss  in  scaled  parallel  speed  up 

4  E.  Leiss  and  H.  Reddy,  Distributed  Load  Balancing:  Design  and  Performance  Analysis,  W.M.  Keck 
Research  Computation  Laboratory,  5  (1989),  205-270. 


Table  1.  Performance  comparison  for  Example  1  using  fixed-order  (two)  and 
adaptive  p-refinement  methods  with  and  without  balancing  through  tiling  at  each 
time  step. 


(parallel  efficiency).  Tiling,  not  necessary  with  fixed  uniform-mesh  computation,  is  clearly 
needed  with  adaptive  techniques.  Adaptive  computational  results  with  tiling  are  48%  fas¬ 
ter  than  those  without  tiling  and  have  an  88%  better  parallel  efficiency. 

While  these  results  are  encouraging,  there  are  several  ways  of  improving  the  tiling 
and  migration  strategies.  Tiling,  for  example,  has  a  slow  distribution  of  balancing  errors, 
it  currently  provides  no  improvement  in  communications  volume,  and  it  is  constrained  to 
two  dimensions.  DeCougny  et  al.  [14]  assemble  the  neighborhood  workload  requests  of 
the  tiling  procedure  [5,  6]  into  a  forest  of  trees.  Migration  strategies  may  either  occur 
between  pairs  of  processors  or  by  balancing  the  trees.  These  techniques  are  being  imple¬ 
mented  in  three  dimensions  and  are  being  interfaced  to  the  SCOREC  (Scientific  Computa¬ 
tion  Research  Center)  mesh  database.  This  mesh  database  controls  mesh  generation,  adap¬ 
tive  h-refinement,  and  domain  decomposition.  Operators  have  been  written  for  the  rapid 
migration  of  elements  between  processor  partitions  and  for  updating  information  across 
partition  boundaries  as  a  result  of  h-refinement. 

Example  2.  The  flow  space  surrounding  half  of  a  tandem  helicopter  has  been  discre¬ 
tized  into  a  mesh  of  94,069  tetrahedral  elements.  The  projection  of  the  mesh  onto  a  plane 
is  shown  in  Figure  2.  Elements  in  this  plane  have  been  colored  according  to  their  parti¬ 
tioning  on  a  16-processor  IBM  SP-1  computer.  Partitioning  was  done  using  inertial 
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Figure  1.  Computational  time  as  a  function  of  the  number  of  partitions  for  the 
tandem  helicopter  mesh  of  Example  2. 

recursive  bisection  (IRB)  with  migration.  The  IRB  algorithm  performs  recursive  bisection 
of  the  volume  of  the  partition  in  planes  parallel  to  the  principal  axes  of  inertia  rather  than 
along  coordinate  directions.  IRB  may  provide  an  effective  low-cost  way  of  controlling 
communications  cost  in  an  adaptive  setting.  Elemental  “masses”  may  be  assigned  in  pro¬ 
portion  to  local  polynomial  degree  to  use  the  procedure  with  adaptive  hp-refinement 

Computational  effort  is  shown  as  a  function  of  the  number  of  partitions  in  Figure  2. 
The  parallel  complexity  of  IRB-migration  strategy  is  linear  in  the  number  of  elements  NA 
and,  hence,  has  a  constant  time  for  a  fixed  mesh.  This  is  verified  by  the  results  shown  in 
Figure  2.  The  cost  of  maintaining  the  data  structure  during  migration  behaves  similarly; 
thus,  the  total  parallel  computational  cost  is  independent  of  the  number  of  processors. 
Hence,  the  partitioning  and  redistribution  algorithms  have  (near)  perfect  scalability. 

Mesh  generation  underlying  the  SCOREC  mesh  database  is  done  by  an  octree 
decomposition  of  physical  space.  This  can  be  exploited  to  construct  partitioning  and  load 
balancing  algorithms  that  are  faster  than  those  for  arbitrary  unstructured  meshes.  Indeed, 
parallel  partitioning  can  be  performed  by  a  two-step  procedure  that  (i)  determines  cost 
metrics  for  all  subtrees  and  (ii)  partitions  the  octree  according  to  these  metrics  [8,  14]. 
The  partitions  consist  of  octants  that  are  each  the  root  of  a  subtree  and  are  determined  by 
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Figure  2.  Tandem  helicopter  and  the  projection  of  a  portion  of  the  mesh  sur¬ 
rounding  it  onto  a  plane.  Colors  indicate  processor  assignments  using  IRB  with 
migration  (cf.  Example  2). 
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a  truncated  depth- first  search. 

Example  3.  The  tree-based  partitioning  strategy  has  been  applied  to  a  series  of  six 
finite-octree  generated  meshes  [14]  involving  flow  solutions  of  the  Euler  equations  about 
airplanes,  helicopters  (cf.  Example  2),  wings,  and  cones.  The  number  of  elements  in  each 
mesh  ranged  from  16,000  to  293,000  with  and  average  of  174,000.  Partition  quality  is 
measured  as  the  percent  of  element  faces  on  inter- partition  boundaries  relative  to  the  total 
number  of  faces  of  the  mesh.  In  Figure  3,  we  display  these  percentages  as  a  function  of 
the  optimal  partition  size.  The  variance  between  partition  costs  is  as  small  as  the  cost  of  a 
leaf  octant  in  all  cases. 


Figure  3.  Performance  of  the  octree  partitioning  algorithm  on  six  meshes. 


The  performance  metric  shown  in  Figure  3  is  a  measure  of  the  total  surface  area  that 
partitions  have  in  common.  Smaller  values  require  less  communication  relative  to  the 
local  volume  of  data.  The  interface  proportion  is  less  'han  12%  when  the  partition  size 
exceeds  1000  elements  and  drops  to  below  9%  for  partitions  of  2000  elements  or  more. 
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This  performance  is  comparable  to  recursive  spectral  bisection  but  has  O(N^)  instead  of 
O  (N% )  complexity.5 

We  use  the  local  finite  element  procedure  [5,  6]  with  adaptive  h-refinement  and  the 
octree  partitioning  technique  to  investigate  the  supersonic  (Mach  two)  flow  over  a  cone. 
Results,  shown  in  Figure  4,  follow  several  steps  of  h-refinement  obtained  by  adding  and 
deleting  octants  to  and  from  the  tree  structure  and  generating  a  new  mesh  locally.  Com¬ 
putations  were  performed  on  a  Thinking  Machines  CM-5  computer  at  the  University  of 
Minnesota.  The  shock  surface  and  pressure  contours  are  shown  in  the  upper  portion  of 
Figure  4  while  planar  projections  of  partitionings  for  16-  and  32-processor  computers  are 
shown  below.  Color  on  partition  diagrams  denotes  membership  in  a  partition.  Use  of  the 
octree  to  balance  processor  loading  is  simple  and  fast.  It  offers  a  great  deal  of  promise 
for  both  adaptive  and  standard  computation. 


1.3.  Composite  Materials  Manufacturing 


We  have  been  working  with  Rensselaer  Materials  Scientists  William  Hillig  and  John 
Hudson  to  study  fabrication  problems  associated  with  ceramic  composites  with  a  goal  of 
manufacturing  high-temperature  composites  more  rapidly  and  less  expensively  than 
currently  possible.  Hillig  and  colleagues6  describe  a  reactive  vapor  infiltration  (RVI)  pro¬ 
cess  for  manufacturing  fiber-reinforced  ceramic  composites  where  silicon  carbide  (S/C)  or 
alumina  (Al203)  fibers  are  mixed  with  molybdenum  ( Mo )  powder  and  pressed  at  room 
temperature  to  form  a  porous  preform.  The  preform  is  exposed  to  a  silicon  tetra-chloride 
(5/C/ 4)  and  hydrogen  (H2)  flow  where  molecular-surface  reactions  liberate  Si  which, 
when  absorbed  into  the  preform,  reacts  with  Mo  to  form  a  molybdenum  di-silicide 
(MoSi  2)  matrix.  Based  on  considerations  of  mass  conservation,  we  developed  the  follow¬ 
ing  model  of  the  diffusion  of  Si  into  a  porous  pellet  to  form  an  intermediate  (A/o55/3)  sili- 
cide  layer  and  the  desired  MoSi  2  matrix: 


P(pr,) 

Dt 


d(pr,) 

dt 


+  vV(prf)  =  V-D,V(pr,)  -  pTjV  v  +  rh 


i  =  1,  2, 


4.  (1) 


The  symbol  p  denotes  the  mixture  density  and  v  denotes  the  mixture  velocity.  The  mass 
fraction,  diffusivity,  and  production  rate  of  species  1  are,  respectively,  denoted  as  Yit  £>,  , 
and  fjt  with  index  /  =  1,  2,  3,  4,  identifying  Si,  MoSi2,  Mo5Si$,  and  Mo.  Reaction  rates 
are  much  faster  than  diffusion  rates  and  are  described  by  Adjerid  et  al.  [9]. 


3  A.  Potben,  H.  Simon,  and  K.-P.  Liou,  “Partitioning  Sparse  Matrices  with  Eigenvectors  of  Graphs,  SIAM 
J.  Matrix  Anal.  Applies.  11  (1990),  430-452. 

6  N.  Patibandla  and  W.B.  Hillig,  “Processing  of  Molybdenum  Di-silicide  Using  a  New  Reactive  Vapor 
Infiltration  Technique,  J.  Am.  Ceram.  Soc.,”  76  (1993),  1630-1634. 
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With  an  initial  Mo  powder  compressed  to  a  porosity  of  45%,  the  siliciding  reactions 
are  accompanied  by  a  158%  volume  increase  that  fills  the  pores  between  grains  of  Mo 
powder,  but  is  excessive  and  may  cause  cracking.  By  a  combination  of  analysis  and 
experimentation,  we  discovered  that  material  expansion  and  production  times  can  be 
reduced  by  initiating  the  process  with  a  mixture  of  powdered  Mo  and  MoSi2 ■  Local  and 
total  volume  expansion  is  modeled  by  approximating  the  reacting  medium  as  a  viscous 
fluid  and  coupling  the  viscous  momentum  equations  to  the  mass  conservation  system 
describing  reaction  and  diffusion.  Thus, 

p^  +  v[a,p  +  V-(pv)]  =  VT  (2a) 

where  the  traction  matrix  T  has  components 

Tij  =  (rP  +  ^Vv)8 ij  +  pf^.v,-  +  dxVj)  (2b) 

with  X  and  p  being  Lam6  parameters,  p  being  the  pressure,  and  8,  j  being  the  Kronecker 
delta. 

A  volumetric  flow  relation  describes  the  behavior  of  the  pores  between  powdered 
grains  and  we  derive  this  as 

+  (Vv)(l  -  v)  =  £  ~[r,  +  V-D, ,Vpy,],  (3) 

m  «=1  Pi 

where  v  is  the  porosity  and  ft,  /  =  1,  2,  3,  4,  are  the  species  densities. 

Example  4.  We  solve  (1-3)  using  our  adaptive  finite  element  MOL  software  and 
report  some  results  obtained  by  Adjerid  et  al.  [9].  Specific  parameter  values  are  given 
therein  and,  for  the  most  part,  are  not  duplicated  here.  We  compare  computed  and 
observed  results  for  the  square  of  the  thickness  of  the  MoSi2  layer  as  a  function  of  time 
for  three  temperatures  in  the  left  portion  of  Figure  5.  Computed  and  experimental  results 
are  in  excellent  agreement  with  deviations  being  less  than  10%. 

Solutions  were  obtained  by  hpr-refinement  and,  in  Figure  6,  we  show  the  spatial 
mesh  used  at  four  different  times.  Colors  correspond  to  concentrations  of  MoSi2  with  red 
denoting  a  high  concentration  and  blue  denoting  a  low  concentration.  The  transition  from 
red  to  blue  corresponds  to  the  formation  of  Mo^Si^  in  a  narrow  layer  between  the  fabri¬ 
cated  MoSi2  (red)  and  the  unreacted  Mo  (blue).  A  coarse  mesh  and  first-order  method  are 
used  away  from  the  reaction  zone  while  finer  meshes  and  high-order  methods  are  used 
near  the  reaction.  The  mesh  is  concentrated  near  the  reaction  zone  and  is  moving  to 
account  for  the  expansion  as  the  reaction  occurs. 

Results  shown  on  the  right  of  Figure  5  compare  computed  and  observed  values  of  the 
square  of  the  thickness  of  the  MoSi2  layer  as  a  function  of  time  for  three  initial  concentra¬ 
tions.  As  noted,  fabrication  times  can  be  shortened  by  initiating  the  reaction  with  a 


Figure  4.  Shock  surface  and  pressure  contours  found  when  computing  the  Mach 
two  flow  past  a  cone  having  a  half  angle  of  10°  (top).  Partitions  of  the  mesh 
into  16  (lower  left)  and  32  (lower  right)  pieces. 
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Figure  5.  Comparison  of  computed  and  observed  values  of  the  square  of  the 
thickness  of  the  MoSi  2  layer  as  a  function  of  time.  On  the  left,  comparisons  for 
temperatures  of  1100°C  (diamonds),  1200  °C  (triangles),  and  1300  °C 
(plusses).  On  the  right,  comparisons  for  initial  concentrations  of  all  Mo  (trian¬ 
gles),  50%  Mo  and  50%  MoSi2  (plusses),  and  30%  Mo  and  70%  MoSi2  (dia¬ 
monds). 

mixture  of  Mo  and  MoSi2.  The  particular  initial  concentration  of  half  Mo  and  half 
MoSi2,  compressed  to  45%  porosity,  fills  pores  to  within  8%  with  (essentially)  no  volume 

expansion.  Our  model  predicted  both  the  correct  growth  rate  of  the  silicide  layer  the  lack 
of  volume  expansion. 

Finally,  in  Figure  7,  we  exhibit  the  thickness  of  the  MoSi2  layer  at  four  times  for  a 
two-dimensional  computation  using  adaptive  h-refinement  Once  again,  high  concentra¬ 
tions  are  colored  red  and  low  ones  are  colored  blue.  The  mesh  used  for  the  computation 
at  the  indicated  time  has  been  superimposed.  The  bottom  and  right  edges  of  each  figure 
are  lines  of  symmetry.  The  silicide  layers  progress  inward  from  the  top  and  left  edges  to 
build  the  MOSi2  and  Mo5Si3  layers.  Reaction  fronts  are  sharp  and  the  mesh  is  concen¬ 
trated  in  and  following  these  regions.  The  Mo5Si3  layer  is  a  narrow  rone  that  precedes 
the  MoSi2  layer  into  the  unreacted  molybdenum. 

Work  on  materials  processing  is  just  beginning  and  is  the  focus  of  our  current  grant 
with  AFOSR  (F49620-94- 1-0200).  Our  intentions  are  to 

i.  Improve  the  mechanical  coupling  model  by  including  elastic  and  plastic  deformation. 

ii.  Develop  microstructure  models  by  studying  the  flow  around  grains  and  fibers  and 
through  pores. 
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iii.  In  collaboration  with  Wright  Laboratories  and  Centric  Engineering,  study  fiber  coat¬ 
ing  processes. 

iv.  Study  the  effects  of  grain  shape,  initial  porosity,  and  initial  composition  in  an  effort 
to  further  reduce  processing  times. 

The  software  and  model  are  not  limited  to  RVI  and  we  will  investigate  its  use  with 
other  fabrication  processes. 
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2.  Interactions 

Invited  lectures  and  other  interactions  by  personnel  involved  with  research  supported 

by  this  grant  follow. 

1.  Flaherty  lectured  on  “Adaptive  and  Parallel  High-Order  Methods  for  Conservation 
Laws,”  University  of  Kentucky,  Lexington,  April  21,  1993. 

2.  Flaherty  attended  the  AFOSR  Contractor! Grantee  meeting  at  Washington  University, 
May  20-21,  1993.  He  lectured  on  “Adaptive  and  Parallel  Computational  Techniques 
for  Partial  Differential  Equations.” 

3.  Flaherty  attended  the  AFOSR  MCAD  Review  at  the  University  of  Michigan,  May  27, 
1993.  He  presented  an  evaluation  of  the  material  to  N.  Glassman  and  C.  Holland  of 
AFOSR. 

4.  Marsha  Berger  of  New  York  University  visited  Flaherty  and  Mark  Shephard,  of 
Rensselaer’s  Scientific  Computation  Research  Center  on  June  16,  1993,  to  discuss 
adaptive  techniques  on  Cartesian  grids. 

5.  Flaherty,  with  Julian  Cole  and  Donald  Drew  of  Rensselaer  and  Bernard  Matkowsky 
of  Northwestern  University  and  Lu  Ting  of  New  York  University,  organized  and  ran 
a  workshop  on  Perturbation  Methods  in  Physical  Mathematics  at  Rensselaer,  June 
23-26,  1993.  The  workshop  honored  Joseph  B.  Keller  and  recognized  his  seventieth 
birthday.  More  than  one  hundred  scientists  attended  the  symposium.  Flaherty  lec¬ 
tured  on  “Adaptive  Numerical  Procedures  for  Singularly  Perturbed  Partial 
Differential  Equations.”  Written  proceedings  of  this  workshop  will  be  published  as  a 
special  issue  of  SIAM  Journal  of  Applied  Mathematics. 

6.  Flaherty  visited  N.  Glassman,  C.  Holland,  and  M.Q.  Jacobs  at  AFOSR  headquarters 
to  discuss  the  use  of  adaptive  methods  for  addressing  materials  processing  problems 
involving  ceramic  composites. 

7.  Flaherty,  with  Ivo  Babuska  of  the  University  of  Maryland,  William  Henshaw  of  IBM, 
John  Hoppcroft  of  Cornell  University,  Joseph  Oliger  of  Stanford  University,  and  Tay- 
fun  Tezduyar  of  the  University  of  Minnesota  organized  a  summer  program  on  Model¬ 
ing,  Mesh  Generation,  and  Adaptive  Numerical  Methods  for  Partial  Differential 
Equations  at  the  Institute  for  Mathematics  and  its  Applications  of  the  University  of 
Minnesota,  Minneapolis,  July  6-23,  1993.  Flaherty  lectured  on  “Adaptive  and  Paral¬ 
lel  High-Order  Methods  for  Conservation  Laws.” 

8.  Flaherty  attended  the  AFOSR  Working  Group  Meeting  on  Modeling  and  Design  for 
High  Pressure  Crystal  Growth  Processes  at  Rome  Laboratories,  Hanscom  AFB,  July 
29-30,  1993. 
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9.  Flaherty  lectured  on  “Computers:  Small  and  Large,  Serial  and  Parallel,”  “An  Intro¬ 
duction  to  the  Parallel  Solution  of  Finite  Element  Problems,”  “Parallel  Solution  Pro¬ 
cedures  on  Shared  and  Distributed  Memory  Computers,”  and  “Adaptive  and  Parallel 
Finite  Element  Methods  for  Conservation  Laws,”  Summer  workshop  on  Frontiers  in 
Finite  Element  Technology,  Gy6r,  August  16-19,  1993.  This  Hungarian-sponsored 
symposium  was  attended  by  faculty  and  graduate  students  from  former  iron-curtain 
countries.  The  three  keynote  lecturers.  Flaherty,  Ivo  Babuska  of  the  University  of 
Maryland,  and  Bama  Szabo  of  Washington  University,  each  presented  a  sequence  of 
four  talks. 

10  Flaherty  lectured  on  “Adaptive  and  Parallel  High-Order  Methods  for  Conservation 
Laws”  at  the  University  of  Toronto,  February  25,  1994. 

11.  Flaherty  and  John  Hudson  of  Rensselaer’s  Materials  Engineering  Department  were  to 
fly  to  Wright  Laboratories  to  meet  with  James  Mai  as  and  Stephen  LeClair  on  March 
4,  1994  to  discuss  mutual  interest  in  ceramic  fiber  coating  processes.  Poor  weather 
forced  the  cancellation  of  their  private  flight,  but  Hudson  attended  the  meeting  via 
commercial  transportation.  There  meeting  resulted  in  the  submission  of  an  STIR 
proposal  on  fiber  coating  processes  with  Rensselaer  and  Centric  Engineering  as 
academic  and  industrial  partners. 

12.  Flaherty  attended  the  Workshop  on  Basic  Phenomena  in  Plasticity  at  Carnegie  Mellon 
University,  March  17-19,  1994.  He  lectured  on  “Adaptive  Methods  for  Partial 
Differential  Equations  with  Application  to  Shear  Band  Formation.” 

13.  Flaherty  presented  a  lecture  on  “Adaptive  Methods  for  Parabolic  Partial  Differential 
Equations”  at  the  University  of  Buffalo,  May  5,  1994. 
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3.  Publications  and  Manuscripts 

Papers  published  or  in  press  and  submitted  manuscripts  on  research  supported  by  this 

contract  follow. 

Publications 

1.  D.C.  Amey,  R.  Biswas,  and  J.E.  Flaherty,  “An  Adaptive  Mesh  Moving  and 
Refinement  Procedure  for  One-Dimensional  Conservation  Laws,”  Appl.  Numer. 
Math.,  11  (1993),  259-282. 

2.  S.  Adjerid,  J.E.  Flaherty,  and  Y.J.  Wang,  “A  Posteriori  Error  Estimation  with  Finite 
Element  Methods  of  Lines  for  One-Dimensional  Parabolic  Systems,”  Numer.  Math., 
65  (1993),  1-21. 

3.  P.K.  Moore  and  J.E.  Flaherty,  “High-Order  Adaptive  Finite  Element-Singly  Implicit 
Runge-Kutta  Methods  for  Parabolic  Differential  Equations,”  BIT,  33  (1993),  309- 
331. 

4.  M.  Benantar,  J.E.  Flaherty,  C.  Ozturan,  M.S.  Shephard,  and  B.K.  Szymanski,  “Paral¬ 
lel  Computation  in  Adaptive  Finite  Element  Analysis,”  Chap.  7  in  C.A.  Brebbia  and 
M.H.  Aliabadi,  Eds.,  Adaptive  Finite  Element  and  Boundary  Element  Methods,  Com¬ 
putational  Mechanics  Publics.,  Southampton,  co-published  with  Elsevier  Applied  Sci¬ 
ence,  London,  1993. 

5.  K.  Devine,  J.E.  Flaherty,  S.  Wheat,  and  A.B.  Maccabe,  “A  Massively  Parallel  Adap¬ 
tive  Finite  Element  Method  with  Dynamic  Load  Balancing,”  Tech.  Rep.  SAND93- 
0936C,  Sandia  National  Laboratories,  1993.  Also,  Proc.  Supercomputing  ‘93,  to 
appear. 

6.  R.  Biswas,  K.D.  Devine,  and  J.E.  Flaherty,  “Parallel  Adaptive  Finite  Element 
Methods  for  Conservation  Laws,”  Appl.  Numer.  Maths.,  14  (1994),  255-284. 

In  Press 

7.  S.  Adjerid,  J.E.  Flaherty,  and  Y.J.  Wang,  “Adaptive  Method-of-Lines  Techniques  for 
Parabolic  Systems,”  Proc.  First  International  Symposium  on  Method  of  Lines,  Sur¬ 
faces  and  Dimensional  Reduction  in  Computational  Mathematics  and  Mechanics, 
Athens,  1994,  to  appear. 

8.  R.  Loy  and  J.E.  Flaherty,  “Parallel  Adaptive  Load-Balancing  Schemes  for  Three- 
Dimensional  Conservation  Laws,”  to  appear  in  W.  Ames,  Ed.,  Proc.  I  MACS  World 
Congress,  1994,  Atlanta. 

9.  S.  Adjerid,  J.E.  Flaherty,  W.  Hillig,  J.  Hudson,  and  M.S.  Shephard,  “Adaptive 
Method  of  Lines  Techniques  for  Vapor  Infiltration  Problems,”  to  appear  in  W.  Ames, 
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Ed.,  Proc  IMACS  World  Congress,  1994,  Atlanta. 

10.  S.  Adjerid,  M.  Aiffa,  and  J.E.  Flaherty,  “High-Order  Finite  Element  Methods  for 
Singularly-Perturbed  Elliptic  and  Parabolic  Problems,”  SIAM  J.  Appl.  Math.,  1994, 
to  appear. 

11.  B.E.  Webster,  M.S.  Shephard,  Z.  Rusak  and  J.  E.  Flaherty,  “An  Automated  Adaptive 
Time-Discontinuous  Finite  Element  Method  for  Unsteady  Compressible  Airfoil  Aero¬ 
dynamics,”  AIAA  Journal,  1994,  to  appear. 

Submitted  Manuscripts 

12.  M.  Benantar,  J.E.  Flaherty,  and  M.S.  Krishnamoorthy,  “Triangle  Graphs,”  1992, 
submitted  for  publication. 

13.  J.E.  Flaherty  and  P.K.  Moore,  “An  hp-Adaptive  Method  in  Space  and  Time  for  Para¬ 
bolic  Systems,”  Tech.  Rep.  93-15,  Dept.  Comp.  Sci.,  Rensselaer  Polytechnic  Insti¬ 
tute,  June  1993.  Also,  submitted  for  publication. 

14.  H.L.  deCougny,  K.D.  Devine,  J.E.  Flaherty,  R.M.  Loy,  C.  Ozturan,  and  M.S. 
Shephard,  “Load  Balancing  for  the  Parallel  Adaptive  Solution  of  Partial  Differential 
Equations,”  Tech.  Rep.  94-8,  Dept.  Comp.  Sci.,  Rensselaer  Polytechnic  Institute, 
December  1994.  Also  submitted  for  publication. 


